
fix(file-log) add timeout to reopen files automatically #2348

Closed

Conversation

@Tieske (Member) commented Apr 6, 2017

If files got rotated, the plugin would keep writing to the existing file descriptor instead of opening the new file. The new `valid` property (in seconds, default 60) will automatically reopen the file after the specified number of seconds.

This is modeled after the nginx directive open_log_file_cache, which has a similar valid property, with the same default.
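
A minimal sketch of the mechanism (illustrative names, using Lua's stdio for brevity; the plugin itself writes through the FFI):

```lua
-- path -> { file = <handle>, opened_at = <timestamp> }
local cache = {}

local function get_file(conf)
  local entry = cache[conf.path]
  -- a handle older than `valid` seconds may point at a rotated file: drop it
  if entry and ngx.time() - entry.opened_at >= conf.valid then
    entry.file:close()
    cache[conf.path] = nil
    entry = nil
  end
  if not entry then
    entry = {
      file = assert(io.open(conf.path, "a")),
      opened_at = ngx.time(),
    }
    cache[conf.path] = entry
  end
  return entry.file
end
```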

That said, this plugin is still not really suitable for production use. There are too many issues related to blocking I/O, file errors, etc. Those have been properly solved by tools like syslog, which remain the preferred solution.

Issues resolved

Fix #2325

@Tieske requested a review from @subnetmarco, April 6, 2017 13:36
@subnetmarco (Member) commented Apr 6, 2017

  1. Does this mean that the file will be re-opened even if there is no need to do so?
  2. What happens if valid=60 and the file is rotated at the 30-second mark? Will we be losing 30 seconds' worth of logs until valid kicks in and reopens the file?

@Tieske (Member Author) commented Apr 6, 2017

  1. Does this mean that the file will be re-opened even if there is no need to do so?

Yes. If 60 seconds pass without any request and a request then comes in at second 61, the file will be closed and opened again.

  2. What happens if valid=60 and the file is rotated at the 30-second mark? Will we be losing 30 seconds' worth of logs until valid kicks in and reopens the file?

Same as what happens now: the entry gets written to the old file descriptor, so yes, it gets lost. This can be mitigated by setting valid to 1 second, for example; less gets lost, but logging will be (slightly) slower.

There is no simple way around this, other than rebuilding syslog inside OpenResty. This is pretty much the best we can do; if that's not enough, stop using this plugin and use the syslog plugin instead.

@p0pr0ck5 (Contributor) commented Apr 7, 2017

Rather than worrying about a single FD, why don't we have the plugin buffer the contents and asynchronously flush the buffer to disk? This is the same model as used by the buffer parameter of the nginx access_log directive; it doesn't result in lost data points, and it resolves the rotated-fd bug. The overhead of opening and closing the fd on a recurring basis at this scale (given a flush time on the order of seconds) is negligible.
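
A rough sketch of that model in an OpenResty context; the flush interval, path, and function names are illustrative:

```lua
local buffer = {}

-- log phase: only accumulate in memory, no disk I/O on the request path
local function log(line)
  buffer[#buffer + 1] = line
end

-- recurring timer: flush everything accumulated since the last run
local function flush(premature)
  if premature or #buffer == 0 then
    return
  end
  local lines = table.concat(buffer, "\n") .. "\n"
  buffer = {}
  -- open/append/close per flush: a stale descriptor survives at most one
  -- interval, and the open(2) cost is paid once per interval, not per request
  local f = io.open("/var/log/kong/file.log", "a")
  if f then
    f:write(lines)
    f:close()
  end
end

-- ngx.timer.every needs lua-nginx-module >= 0.10.9; on older versions a
-- self-rescheduling ngx.timer.at achieves the same
ngx.timer.every(1, flush)
```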

@bungle (Member) commented Apr 7, 2017

A few possibilities:

  1. forget caching fds (the OS will cache anyway — this should be the default; this plugin is mainly for debugging anyway)
  2. append to shm and have a recurring, non-fd-caching timer flush it to the file (with exclusive locking across workers, using lua-resty-lock — see the sketch after this comment)

I feel reopening files is not the biggest issue anyway, compared to blocking I/O in general.

I'm actually not sure whether there is also an issue with multiple workers writing to the same file and getting mixed output (do the writers hold exclusive locks?)
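
A rough sketch of option 2, which would also answer the interleaving question, assuming two `lua_shared_dict` zones declared in the nginx config and the lua-resty-lock library; every name below is illustrative:

```lua
local resty_lock = require "resty.lock"

-- assumes: lua_shared_dict file_log_buffer 5m; lua_shared_dict file_log_locks 1m;
local buf = ngx.shared.file_log_buffer

-- log phase: every worker appends to the shared list instead of the file
local function log(line)
  buf:rpush("entries", line)  -- shdict list API needs lua-nginx-module >= 0.10.6
end

-- recurring timer in each worker; the lock ensures only one of them drains
-- the list at a time, so output from different workers is never interleaved
local function flush(premature)
  if premature then
    return
  end
  local lock = resty_lock:new("file_log_locks")
  if not lock or not lock:lock("flush") then
    return  -- another worker holds the lock and is flushing
  end
  local f = io.open("/var/log/kong/file.log", "a")
  if f then
    local line = buf:lpop("entries")
    while line do
      f:write(line, "\n")
      line = buf:lpop("entries")
    end
    f:close()
  end
  lock:unlock()
end
```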

@bungle (Member) commented Apr 7, 2017

Not 100% sure, but that string_to_char should not be needed, as LuaJIT will do the conversion automatically — if you use the correct signature for open; it is currently missing the const.

assert.is_nil(err)
assert(not file_log:find(uuid1, nil, true), "did not expected 1st request in logfile")
assert(file_log:find(uuid2, nil, true), "expected 2nd request in logfile")
assert(file_log:find(uuid2, nil, true), "expected 3rd request in logfile")
Contributor (on the test snippet above):

Buggy test, this should be uuid3?

Member Author:

Good catch!

Member Author:

fixed!

Commit: fix(file-log) add timeout to reopen files automatically

If files got rotated, the plugin would keep writing to the existing file descriptor, instead of opening the new file. The new `valid` property (in seconds) will automatically reopen the file after the specified number of seconds.
@Tieske (Member Author) commented Apr 18, 2017

@thefosk If you don't want any lost data, setting the timeout to 0 will close and reopen the file on every request. Hence, when rotating: temporarily set it to 0, rotate the log, and set it back to a higher value. That ought to work without losing any data.

@p0pr0ck5 @bungle Though the ideas mentioned do make sense, in the end they don't resolve the underlying use of blocking I/O. Hence this is the minimum viable fix for the file-log plugin. I don't feel any more effort spent on this is worthwhile, as it wouldn't fix the blocking I/O either. So imo, if this is not good enough, use syslog.

@p0pr0ck5 (Contributor) commented:

set it temporarily to 0, rotate the log, and set it back to a higher value. That ought to work without losing any data.

This seems like an unnecessarily complex and arbitrary set of steps just to rotate a file. Do we really want to tell everyone who wants to rotate logs like this to jump through these hoops?

in the end they don't resolve the underlying use of blocking I/O.

Neither does this PR :) Blocking I/O isn't the question here; stale FDs are.

Hence this is the minimum viable fix for the file-log plugin.

I disagree. The simplest fix to the issue of stale FDs would be to simply not cache the FD at all. A recurring close/open adds additional complexity and overhead.
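
That approach would look roughly like this (stdio for brevity; the path and names are illustrative):

```lua
-- open/append/close on every entry: no descriptor outlives the request,
-- so log rotation needs no cooperation from the plugin at all
local function write_entry(path, line)
  local f, err = io.open(path, "a")
  if not f then
    ngx.log(ngx.ERR, "failed opening log file: ", err)
    return
  end
  f:write(line, "\n")
  f:close()
end
```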

@p0pr0ck5 (Contributor) commented:

Another approach that comes to mind would be to provide an API hook to cycle the FDs, similar to what SIGUSR1 does now. This is a bit more complex, admittedly, but is truer to how Nginx handles log files currently.

@john-paul-keenan commented:

I agree with @p0pr0ck5 that this does not seem necessary and would be asking our customers to do a lot of manual work.

This is becoming a serious issue (potential showstopper) for some users.

@Tieske (Member Author) commented Apr 21, 2017

In that case we can change the default setting to 0, which has the same effect. If performance is an issue for the customer, they can use the more complex route.

Note: the default is in the code in 2 locations, here and here

@bungle (Member) commented Apr 21, 2017

Do we still have a concern about mixed output, i.e. two workers writing to the same file at the same time without holding any locks? Maybe it is not related here, but I can see that coming up next. Also, it is still a good idea to change the open signature and not use string_to_char with it.
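
For reference, a minimal sketch of that signature change; the flag values are assumed Linux ones and the cdef is illustrative, not the plugin's actual declaration:

```lua
local ffi = require "ffi"
local bit = require "bit"

-- with `const char *` in the declaration, LuaJIT converts a Lua string
-- argument automatically; with a plain `char *` it refuses the conversion,
-- which is why the manual string_to_char copy was needed in the first place
ffi.cdef[[
int open(const char *pathname, int flags, int mode);
]]

-- assumed values from the Linux headers
local O_WRONLY, O_CREAT, O_APPEND = 0x0001, 0x0040, 0x0400

local fd = ffi.C.open("/var/log/kong/file.log",
                      bit.bor(O_WRONLY, O_CREAT, O_APPEND), 420)  -- mode 0644
```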

@Tieske added this to the 0.10.2 milestone, Apr 21, 2017
@p0pr0ck5 (Contributor) commented Apr 21, 2017

Closing this in favor of a strategy to optionally close the fd on each invocation, per conversations with @Tieske and @thefosk.

@john-paul-keenan a gentle reminder that this plugin is highly unsuitable for high-performance production environments, not as a result of the design of the plugin per se, but because of the nature of disk access and OpenResty's architecture and idiomatic paradigms :)

@p0pr0ck5 closed this Apr 21, 2017
@Tieske mentioned this pull request Sep 7, 2017